Species-specific SNP arrays for non-invasive genetic monitoring of a vulnerable bat

Genetic tagging from scats is one of the minimally invasive sampling (MIS) monitoring approaches commonly used to guide management decisions and evaluate conservation efforts. Microsatellite markers have traditionally been used but are prone to genotyping errors. Here, we present a novel method for individual identification in the Threatened ghost bat Macroderma gigas using custom-designed Single Nucleotide Polymorphism (SNP) arrays on the MassARRAY system. We identified 611 informative SNPs from DArTseq data from which three SNP panels (44–50 SNPs per panel) were designed. We applied SNP genotyping and molecular sexing to 209 M. gigas scats collected from seven caves in the Pilbara, Western Australia, employing a two-step genotyping protocol and identifying unique genotypes using a custom-made R package, ScatMatch. Following data cleaning, the average amplification rate was 0.90 ± 0.01 and SNP genotyping errors were low (allelic dropout 0.003 ± 0.000) allowing clustering of scats based on one or fewer allelic mismatches. We identified 19 unique bats (9 confirmed/likely males and 10 confirmed/likely females) from a maternity and multiple transitory roosts, with two male bats detected using roosts, 9 km and 47 m apart. The accuracy of our SNP panels enabled a high level of confidence in the identification of individual bats. Targeted SNP genotyping is a valuable tool for monitoring and tracking of non-model species through a minimally invasive sampling approach.


Sexing markers
Four putative sex chromosome markers were previously identified for M. gigas in Ottewell et al. 38 .To improve efficiency of data handling and to reduce processing time, we developed TaqMan probes to sex scats on a realtime PCR (qPCR) machine.Multiplexed probes were designed using the PrimerQuest™ online tool (https:// sg.idtdna.com/ Prime rquest/ Home/ Index).
We tested and optimised the new probes using one male tissue and three scats with three different concentrations of primer/probe mixes (0.5, 1.0, and 2.0 μM, Supplementary 2 Fig. S1).Then, we tested sex allocation consistency with eight tissue samples (four females and four males) and six scats (Supplementary 2 Fig. S2).Zfy was excluded in the second trial and subsequent runs because DDX3Y and SRY provided sufficient data.
Table 1.Macroderma gigas SNP filtering steps to select potential loci from DArTseq for MassARRAY panels.www.nature.com/scientificreports/ The final reaction consisted of primer and probe mix (1.0 μM) amplified in 10 µL reactions using the Prime-Time™ Gene Expression Master Mix (Cat No: 1055772) as per the manufacturer instructions with an annealing temperature of 60 °C, 40 amplification cycles and 4 µL of faecal DNA.The reactions were run on the CFX96™ Real-Time System C1000 Touch Thermal Cycler (BIO-RAD, Singapore) and analysed using the CFX Maestro software (BIO-RAD, Singapore).

Case study
Scats (n = 209) were collected from seven roosts in West Angelas between 15-19 October 2019 (Fig. 1).At each roost, a maximum of 20 fresh to moderately fresh M. gigas scats were collected from tarpaulins placed on the cave floor for several months.We chose single scats that were not touching others to avoid any cross-contamination, and placed each into an individual envelope.Samples were kept frozen until DNA extraction and every scat was handled separately onwards using sterile laboratory and handling procedures.DNA was obtained from the scats by scraping the outer surface of frozen scats with a blade.Scraped material was processed using the Omega Biotek Mag-Bind Stool DNA 96 kit following the manufacturer's instructions with a modification of using 50% diluted elution buffer in the final elution step to reduce EDTA inhibition for downstream analyses.Samples were eluted in 100 μL elution buffer.
We employed a two-step genotyping protocol.In the first step, we genotyped all samples with the first SNP panel (Panel 1, 47 SNPs) to identify unique individuals (P ID analysis indicates 20 markers were needed to separate related individuals; "Results").We were unable to obtain matched tissue and scat samples to evaluate the performance of the panel on invasively-collected vs non-invasive samples due to ethical considerations as the species is highly sensitive to disturbance.Instead, we re-genotyped 10% of samples (23 randomly selected samples) with Panel 1 to ensure consistency across runs and allow calculation of allelic dropout error rate.Rather than using the multi-tubes approach employed previously for microsatellite studies, we accounted for genotyping error estimates from re-genotyping in the clustering analysis (i.e.2.5% from allelic dropout × 47 loci is 1.2 SNP).This approach is much cheaper than a multi-tubes approach and the data is handled based on scat quality.MassARRAY SNP results were processed in a custom R package 'ScatMatch' 45 .ScatMatch contains several custom functions and visualisations for assessing genotype quality and clustering of samples based on their genotype dissimilarity.First, scat genotypes are 'cleaned' based on sample and locus amplification rates (typically 90% and 80%, respectively) to retain only high-quality data.Samples with low amplification rates (Supplementary 3 Fig.S1) are likely to be older and contain more errors 4,13,14 .ScatMatch then employs hierarchical cluster analysis using the R package 'stats' with the function 'hclust' and the method "average" 42 to group scats based on the number of allele mismatches.Each scat genotype is assigned initially to its own cluster, and then the algorithm proceeds iteratively to join the two most similar clusters, and continues until only one cluster is left.The clustering is visualised as a dendrogram and we determined the cut height h (essentially the number of allele mismatches) at which to accept clustered samples are from the same individual.This threshold is decided based on inspection of several data visualisations.An elbow graph ('elbow_plot' function) shows the numbers of groups or putative individuals identified with increasing numbers of allele mismatches, and with the clustering threshold determined by the point at which the number of groups stabilises.We looked for a clear distinction in the number of SNP differences between grouped and ungrouped genotypes when the pairwise dissimilarity matrix is visualised as a heat map ('heat_plot' function).Lastly, we assessed the clustering threshold from the misassignment graph ('misassign' function).The graph plots the frequency of allelic differences within groups (SNP differences among scats from the same individual) and between groups (SNP differences between scats from different individuals) assuming the allelic frequency to follow a binomial distribution.Any allelic differences falling into the within group distribution are likely to be genotyping errors resulting from variation in DNA quality between scat samples from the same individual.Each selected value of h generates an overlap of the upper 0.995 percentile of the within group distribution and the lower 0.005 percentile of the between groups distribution.A greater degree of overlap means a higher probability of misassigning scats to individuals or would indicate the lack of SNP variation for individual identification.
In the second step, after scats had been assigned to individuals by 'ScatMatch' , we selected one sample per bat with the best amplification rate to be genotyped with MassARRAY Panels 2 and 3 to obtain population genetic information.We also selected up to three random samples per bat to determine sex using the TaqMan qPCR assay.Because the amplification of sexing markers from scats is not always consistent due to variable sample quality, we established several criteria for assigning sex.Amplification was considered successful if the qPCR amplification RFU signal was ≥ 50.Samples were considered male if they met additional criteria as follows: a ratio of Y-to X-linked RFU signal > 0.1; and consistently assigned to the same sex in multiple samples.'Likely' sex was defined when a group of samples with a small level of disagreement among markers and/or samples and the sex selected made up the majority of the result.'Undetermined' was defined when a group of samples with amplification signal below 50 RFU from multiple markers or sex could not be assigned with confidence (Supplementary 3 Table S1).
Using all filtered loci from three panels, summary population genetic diversity statistics such as observed heterozygosity (Ho), gene diversity or observed heterozygosity (Hs), allelic richness (A) and inbreeding coefficient (F IS ) were calculated from all bats in the Hierfstat package in R 42 .The contemporary effective population size (N e ) was estimated in NeEstimator 46 using the Linkage Disequilibrium method 46,47 .The parametric method was used to calculate the 95% confidence intervals of each N e estimate 48 .Pairwise genetic relatedness between all bats was calculated using an R package 'Related' 49 .We used the Ritland estimator as recommended by Attard et al. 50for SNP data that gives a value of 1.0 for identical twins or clones, 0.5 for parent-offspring or full-sib, and 0.25 and 0.125 for second and third-order relationships 51 .

Ethical approval
No animals were directly involved during study and only data has been used.The unpublished DArTseq data of Pilbara M. gigas tissue samples were collected under scientific permits issued by the Western Australian Department of Biodiversity, Conservation and Attractions (SF002775, SF003138, SF005423, and SF005774).

SNP selection
Out of 33,340 SNPs, 611 SNPs were identified as suitable, high quality, polymorphic candidate loci for the MassARRAY panels (Table 1).The filtered markers exhibited variation of homozygosity and heterozygosity in the in silico dataset as anticipated (Fig. 2a).A Pearson Principal Component Analysis of the original DArTseq genotypes showed an emergence of population structure pattern when more loci were included (Fig. 2b and Supplementary 3 Fig.S2).At the threshold of 0.0001, P ID analysis of one panel (50 SNPs) showed that at least 10 or 20 loci were required to separate unrelated or related bats respectively (Fig. 2c).Inclusion of a larger number of loci did not change the result of P ID analysis (Supplementary 3 Fig.S3), but it reduced standard deviations in population genetic diversity estimates although actual parameter estimates were consistent across all data subsets (Supplementary 3 Fig.S4).As a balance between per-sample genotyping costs and information content we designed three SNP panels.One MassARRAY SNP panel can contain up to 50 SNPs but due to constraints on multiplex primer design, our initial panels contained a total of 140 SNPs (Panel 1 = 47 SNPs, Panel 2 = 49, Panel 3 = 44, Supplementary 4).

SNP panel performance
First, we compared genotyping success and error rates for seven tissue (20 ng/μL) and 11 scat (concentration unknown) samples.Based on 108 SNP loci with ≥ 50% amplification rate, tissue samples had an average sample amplification of 0.691 ± 0.082 (range 0.34-0.99)and 0.708 ± 0.067 (range 0.0-0.98)for scats.We allowed a lower amplification threshold because we used all three SNP panels for this trial.Average allelic dropout was low for both scat and tissue samples (tissue: 0.0 and scat: 0.038 ± 0.018).We also trialled the effects of DNA concentration on amplification and error rates.We found that the sample amplification rate ranged between 0.89 to 1.0 and the rate increased with DNA concentration but reached a plateau after 0.6 ng/μL (Supplementary 2 Fig. S3).Lastly, we compared sample amplification rates of scat DNA extracted with the Omega Biotek Mag-Bind Stool DNA 96 kit (Omega, USA, Cat No: M4016-01) and the QIAamp ® Fast DNA Stool Mini kit (Qiagen, Germany, Cat No: 51604).The average amplification rates, after removing completely failed samples, were also significantly higher for the Omega kit (0.901 ± 0.007, range 0.537-1.0)compared to the Qiagen kit (0.764 ± 0.013, range 0.293-1.0)(W = 20,605, P-value < 0.001, Supplementary 2 Fig. S4).

Case study
In the first step of our genotyping protocol, all samples were genotyped for Panel 1 to identify individuals represented in the collection of scats.Out of 209 scats and 23 replicates, 13 scats and 2 replicates (6.5%) failed to amplify.The overall sample amplification rate (excluding failed samples) ranged from 0.54 to 1.0, with a mean of 0.90 ± 0.01.Allelic dropout in repeated samples was 0.025 ± 0.004 and was negatively correlated with amplification rate (Supplementary 3 Fig.S1).Evaluation of alternative sample and locus filtering thresholds (Supplementary 3 Fig.S5) clearly shows the impact of data quality on the ability to confidently cluster scat genotypes.Removing lower quality samples/loci results in a clear plateau in the number of individuals identified regardless of the number of SNP mismatches past h = 1 visualised in the elbow graph, and greater distinction between 'within individual' (caused by genotyping errors) and 'between individual' (caused by biological variation) SNP differences (Supplementary 3 Fig.S5).As a result of these visualisations, we chose to filter our data to samples and loci with ≥ 0.9 and ≥ 0.8 amplification rates, respectively, which resulted in 125 scats and 40 loci (Panel 1), and allelic dropout rate reduced to 0.003 ± 0.000.For high quality samples, we identified the threshold SNP mismatch number as h = 1 based on inspection of elbow and misassignment graphs (Fig. 3a,b), with the heat map showing that scats tend to cluster together if there were SNP differences of ≤ 1 (Fig. 3c).Based on these analyses, we assigned scats with ≤ 1 SNP mismatch to the same bat.
Based on stringent filtering criteria, we identified 19 unique genotypes (individuals) from 125 filtered scats and 40 filtered loci.Of the 19 individuals identified, there were eight males, one likely male, five females and five likely females (Supplementary 3 Table S2).This gave a sex ratio of males to females of 1.6:1 for confirmed sex, or 0.9:1 for all bats.We detected between 1 to 8 unique individuals per roost (Fig. 4).Stringent filtering resulted in a loss of 40% of the total number of scats, indicating 19 individuals as a minimum estimate of the number of bats present.Using more relaxed filtering criteria and threshold mismatch number allowed identification of www.nature.com/scientificreports/further individuals (30 to 38 bats using loci/scat amplification rates of 0.8 and 0.7 respectively, Supplementary 3 Fig.S5), however, lowered data quality reduces confidence in these identifications.
Of 19 bats, we only detected two individuals (M8, M10) using more than one roost during the sampling period (Supplementary 3 Table S2) with the remaining bats only detected using a single roost.M8 was detected in roost AA1 and L3, 9.2 km apart while M10 was detected in roost AA2 and A1, 47 m apart (Fig. 1).Based on the number of scats, we infer that bats F likely 5 (No. scats = 8), M6 (13), F7 (19), M8 (37), F9 (17), and M10 (7) spent significant time in the study area and were likely to be resident bats.While both M8 and M10 were detected in multiple roosts, based on the number of scats, M8 spent most of his time in roost AA1 while M10 spent a similar amount of time in both roosts (Supplementary 3 Table S2).
Roost AA1 had the largest number of scats and the highest number of bats detected (Fig. 4).Many of these bats were females or likely females, suggesting this cave may be a maternity roost.Interestingly, despite the likelihood of roost AA1 being used by breeding females, resident bats with many scats collected consisted of both sexes (Supplementary 3 Table S2).The remaining caves were clustered spatially, with only 3 bats detected using cluster L3/AA2/A1, including two resident bats F9 and M10.In contrast, 9 individuals were detected using cave cluster N22/N13/N21, however, only low numbers of scats were detected for each individual (n = 1-3) possibly indicating visits to these caves were occasional.Following identification of individual bats, representative scats from each individual were genotyped with all 93 loci on SNP Panels 2 and 3 to obtain population genetic diversity metrics.Loci were then filtered to those with ≥ 50% amplification rate (n = 74) before combined with loci from Panel 1 (n = 40) resulted in a total of 114 filtered SNPs.The group of bats detected in this site had an average observed heterozygosity of 0.40 ± 0.02 and expected heterozygosity 0.46 ± 0.01 indicating a moderate level of inbreeding (F IS 0.14 ± 0.03).We anticipated that genetic relatedness may be higher in the maternity roost if there is maternal philopatry, implying that female offspring return to their natal cave to reproduce, as has been suggested in past population genetic studies 30,34 .Despite having more females in some roosts, average genetic relatedness was low in all roosts (overall mean R = − 0.04) and not significantly different between roosts (Fig. 4).All roosts consisted of unrelated individuals except one pair of bats from roost AA1 with a relatedness value closer to parent-offspring or full-sibling relationship and another similarly related pair located in AA1 and L3 (Fig. 5).The effective population size was estimated as 52.6 (CI 35.8-92.6).

Discussion
Genetics-based non-invasive sampling is recognised as an efficient and cost-effective tool for monitoring rare and elusive species 52 .Mark-recapture approaches to estimate population abundance have relied traditionally on live capture, which risks injuries or death from capture and handling and may perform inadequately for species exhibiting trap avoidance behaviour 35 .The usefulness of faecal genotyping with microsatellite markers has been shown previously for addressing these issues for ad hoc monitoring of M. gigas populations 38,39 , however, the limited number of markers and higher error rates associated with this marker type suggest they may be problematic for use in statistical mark-recapture analyses.Transitioning from microsatellite markers to pre-selected SNP markers offers increased power to distinguish related individuals, improve faecal genotyping efficiency and reduce error rates.We developed and successfully tested 114 novel high-quality SNPs for automated genotyping of faecal DNA on the MassARRAY platform using a two-step genotyping protocol to firstly identify individuals from scats using a panel ('Panel 1') of 40 high-quality SNPs, and then estimating population genetic parameters (relatedness, genetic diversity and effective population size) from the total SNP set.Combining individual assignment and sexing information, we answered relevant demographic and ecological questions using scats collected across an 83 km 2 study site in the eastern Pilbara.Below, we discuss the development of this technology in comparison to other studies, and in particular how the information gained from this technology provides insight into the biology of M. gigas.Lastly, we discuss limitations and possible future research directions of faecal genotyping.

Performance of SNP panels for faecal DNA genotyping
Scat DNA analysis is often impacted by low DNA quality and quantity, which can lead to errors in assigning scats to individuals, and introduce biases into mark-recapture analyses 53 .We demonstrated that M. gigas scats produced a high locus amplification rate (90%) with very low genotyping error rates (< 2.5%), which was observed in both the trial and the case study at West Angelas.The average amplification and error rates in this study are Note that some bats used multiple roosts so they have been double-counted across these roosts.Likely sex is defined as a group of scats with some disagreement between markers and/or scats, but the sex selected made up the majority of the result.Bars around relatedness mean are standard error bars.
Vol.:(0123456789)  54 ; amplification rate 60%, ADO < 1% in pumas 55 ).Low DNA concentration (< 0.6 ng/μL) negatively impacted the sample amplification rate in our trials, however we did not see a consistent pattern of increasing allelic dropout in our data, with low rates across all DNA concentrations tested (Supplementary 2).Microsatellite studies typically show a negative relationship between genotyping errors and amplification rate, with the rate of allelic dropout increasing with greater scat age (assumed to be a result of low DNA quantity and/or quality 4 ).The better performance of M. gigas scats could be due to environmental factors since caves provide a stable microclimate where scats are sheltered from UV light and rain compared to those exposed to ambient weather conditions 4,5,16,17 .In addition, we found an interaction between the DNA extraction method used and MassARRAY amplification rate with the QIAamp ® Fast DNA Stool Mini kit performing more poorly than the Omega Biotek Mag-Bind Stool DNA 96 kit, most likely due to the presence of EDTA in buffers interfering with MassARRAY multiplex PCR reactions.Overall, we found 94% of faecal samples in our empirical study produced genotypes, but after stringent filtering we retained only high-quality genotypes for analysis, representing a final success rate of 60%.Notwithstanding this relatively low rate, stringent filtering enabled us to match genotypes and identify individuals with a very high level of confidence (allowing 0-1 allelic mismatch between samples).Such high precision also builds trust in an ability to identify recaptures unambiguously within a mark-recapture framework.Allowing greater levels of missing data and a higher mismatch threshold between scats increased the number of scats available for analysis (81-89%) and doubled the numbers of individuals detected.However, relaxing data quality thresholds can increase misassignment of scats to individuals, either by failing to discriminate between scats from related individuals (overmerging) or by differentiating between scats from the same individual due to genotyping errors (oversplitting).Mark-recapture analysis relies on accurate identification of individuals to estimate recapture parameters; consequently misassignment of genotype IDs can have significant implications for estimates of population size made from such studies 56 .
Genotyping error rates have been reported in the range of < 1-49% in microsatellite studies 38,56,57 but are typically lower for SNP arrays 18 .Pre-selection of SNP markers enables high performing bi-allelic loci to be targeted, and automated SNP allele calling on array platforms is less prone to genotype scoring error introduced by humans.The use of bi-allelic markers also reduces the impact of genotyping artefacts ('false alleles') that are often detected in microsatellite studies 58 .Improved control over the performance of genetic markers and consistency in genotype calling permit the increased confidence in identification of individuals for mark-recapture analysis.We also found that sexing markers were more sensitive to DNA degradation from a few inconsistent callings in scats assigned to the same individual.This could be due to null results or allelic dropouts, which were more frequently reported in fox faecal sexing markers than autosomal markers 15 .www.nature.com/scientificreports/Automated SNP genotyping is efficient for individual identification since pre-selection of highly variable SNP makers requires only a small number of markers to differentiate related individuals 18,52 .From simulated data, Glaubitz et al. 21showed that 22-28 biallelic loci are required to achieve an exclusion probability greater than 0.99 if these loci have a minimum allele frequency between 0.25 to 0.50.Similarly, here we found that 20 SNPs (frequency > 0.4) are required to differentiate related individuals (P IDsib < 0.0001).Our simulations however showed that at least 100 SNPs are needed for reliable genetic diversity estimates and population structure analysis.A further limitation of using high frequency, highly variable SNP markers is, whilst cost-efficient, they will overestimate genetic diversity statistics and underestimate effective population size 59 .Such markers can be less sensitive to genetic diversity changes and hence may be less informative for conservation-related questions.

Faecal genotyping reveals M. gigas biology
Our minimum estimate of 19 bats at the West Angelas study site is comparable to previous surveys at the same site where we detected 12 bats in 2015 and 24 bats in 2017 using microsatellite markers 60 .Consistent with previous observations of M. gigas across the Pilbara 31,60 , we detected low numbers of individuals per roost (1-8 bats per cave) with the largest number of bats detected in roost AA1 with a bias towards females (6F:2M) in this cave, suggesting it is likely be a maternity roost.Roost AA1 is one of the largest in the area and has characteristics of a maternity roost with a dome ceiling located at the back of the cave creating dark, warm and humid microclimates 31 .Breeding activity has also been previously reported at this site 31 .Despite a general expectation of female philopatry 30 , we detected only one close to first-order pedigree relationship (parent-offspring or sibling) pair in roost AA1.The average genetic relatedness was similar to other roosts and the incidence of first-and second-degree related individuals was not higher than in other roosts.
Two male bats were detected moving between roosts.Male 8 travelled 9 km between roosts AA1 and L3, but spent most of its time in roost AA1; while Male 10 only travelled between nearby caves AA2 and A1 that are 47 m apart, and appeared to use both roosts equally.Tracking studies of bats indicate movements of 10-15 km during foraging bouts 37 or between diurnal roosts 61 , although Toop's 35 records of marked M. gigas, indicate longer dispersal distances between 20 and 50 km, with one bat travelling as far as 150 km, suggesting a high capacity for dispersal in this species.Likewise, recapture analyses from scats have detected inter-roost movements up to 36 km apart, with an average of 4.0-8.6 km [62][63][64] , although many bats are found to persist in the same roosts over multiple sampling periods, consistent with the observations in this study and with tracking studies showing bats consistently returning to the same diurnal roosts after foraging 37,61 .
Our genetic diversity estimates revealed a moderate level of inbreeding (F IS ) for the population of M. gigas at West Angelas.Despite this observation, we found that mean relatedness within roosts was typically low (most individuals were unrelated) with only two individuals detected close to first-order pedigree relationships (parent-offspring or full-sib).Previous microsatellite analysis has indicated that genetic diversity in the Pilbara population of M. gigas is moderately high (Ho = 0.816), albeit lower than in colonies in the Northern Territory (Ho = 0.874-0.928;Worthington Wilmer et al. 30 ).Further sampling of Pilbara roosts is required to place the population genetic diversity estimates from West Angelas obtained from SNP analysis into a regional context.

Limitations and future directions
Mark-recapture analysis using non-invasive samples has been applied successfully to estimate colony size in bat species Myotis sodalis 65 , Rhinolophus hipposideros 66 , as well as to estimate annual changes in population size in iconic species such as the brown bear Ursus arctos 9 and the Eurasian otter Lutra lutra 57 , demonstrating the viability of genetic mark-recapture approaches.Our single-session field survey case study demonstrated the ability of our novel SNP genotyping approach to confidently identify M. gigas individuals and their recapture using faecal DNA samples.Further monitoring sessions are required to establish a robust design statistical mark-recapture monitoring approach to estimate population abundance using a combination of closed and open sessions.Nevertheless, results from our single-session survey provided insight into aspects of M. gigas biology that can inform survey design and choice of population models in future studies, including confirmation of a high rate of cave fidelity and the spatial scale of inter-roost movements.Further, application of molecular sexing markers provides additional information that may enable application of sex-specific mark-recapture models, incorporating heterogeneity in recapture rates amongst sexes.
We designed our assay specifically for the Pilbara population of M. gigas, yet the species also persists in disjunct, threatened populations elsewhere in northern Australia.SNP panels are known to be impacted by ascertainment bias since SNP selection is typically made based on the allele frequencies of only a subset of individuals or populations that are available to study, rather than the global population 67,68 .Whilst we only had genomic data from eight colonies of M. gigas, these spanned the two main geographic sub-regions in the Pilbara (Chichester and Hamersley).We also have a priori knowledge of low genetic structure in M. gigas in the Pilbara 69 suggesting our SNP panels can be effectively applied to non-sampled colonies in the Pilbara with potentially little or minor loss of utility.However, it is likely that our SNP panels will have lowered effectiveness if applied to disjunct populations elsewhere in Australia due to significant variation in allele frequencies amongst regions 30 .As a result, development of region-specific SNP panels may be necessary to apply this mark-recapture methodology elsewhere.Nevertheless, as a disturbance-sensitive Threatened bat species, faecal DNA monitoring, in combination with an appropriate survey design, provides an opportunity for obtaining robust population abundance information to assist in the conservation and management of the species.
Here, we applied stringent filtering criteria to obtain only high-quality samples and genotypes to ensure a high confidence in assigning scats to individuals.Previous reviews of non-invasive mark-recapture approaches 56,64 have argued that stringently filtering genotype data could inadvertently bias individual recapture probabilities, violating mark-recapture models since individuals shed varying amounts of DNA 70 .Lukacs and Burnham 53 suggest that allowing a small amount of genotyping error to account for heterogeneity in an individual's capture probability 53 .In addition, acknowledging that genotyping errors are somewhat unavoidable in genetic markrecapture studies, Lampa et al. 57 recommended using specific population-models that account for misidentification error (e.g.Lukacs and Burnham 71 estimator, α, implemented in the program MARK 72 ).
As genomic technology advances, it is increasingly possible to obtain detailed population information on threatened species non-invasively.An exciting prospect in recent years is the use of DNA methylation changes as a biomarker to estimate chronological age in humans 73 and non-model species, including the bat Myotis bechsteinii 74,75 .At this stage such approaches are only effective for tissue samples, however, it would be of interest to assess the application of such techniques to target DNA methylation from cells in the intestinal tract for scats.Adding markers for molecular ageing to our scat assays could provide further valuable information on the age structure of populations and recruitment rates to assist in conservation monitoring.

Figure 1 .
Figure 1.The spatial arrangement of roosts where Macroderma gigas scats were collected.The top insert is the site location within Western Australia and the bottom inserts are movement patterns of M. gigas male number 8 (M8) and male number 10 (M10).The map is generated by an R package 'leaflet' version 2.2.1 76 .

Figure 2 .
Figure 2. Macroderma gigas SNP selection for MassARRAY panels assessing for genotype variation for all 611 SNPs (a), A Pearson Principal Component Analysis of 150 SNPs (b), and Probability of Identity of 50 SNPs (c).Different colours in (a) represent 3 genotypes: blue (0) for a homozygote of the reference allele, purple (1) for a heterozygote, and red (2) for a homozygote of the alternative allele.Colours in (b) represent locations of M. gigas tissue samples collected across the Pilbara.Different P ID colours in (c) represent the probability of identifying unrelated individuals (P ID , green) and related individuals (P IDsib , orange).

Figure 3 .
Figure 3. Threshold of allelic mismatch number to call Macroderma gigas scats from the same individuals with amplification rate above 90% for loci and 80% for scats.(a) Graph of different numbers of SNP mismatches allowed and numbers of group assigned to genotypes.(b) Misassignment graph (h = 1) demonstrates the frequency of pairwise allelic mismatches of raw genotypes where both samples have genotypes.The red and blue highlights represent SNP differences between faecal samples from 'within' and 'between' groups or putative individuals respectively.The red and blue dashed lines represent the upper 0.995 percentile and lower 0.005 percentile respectively.(c) Heatmap of pairwise allelic mismatches between all scats.Scats are clustered based on their similarity.A lower number of mismatches presents blue and a higher number of mismatches presents in yellow.

Figure 4 .
Figure 4. Summary of average pairwise genetic relatedness and sex of Macroderma gigas detected in each roost.Note that some bats used multiple roosts so they have been double-counted across these roosts.Likely sex is defined as a group of scats with some disagreement between markers and/or scats, but the sex selected made up the majority of the result.Bars around relatedness mean are standard error bars.